Exploring Shallow Answer Ranking Features in Cross-Lingual and Monolingual Factoid Question Answering
Authors
Abstract
Answer ranking is critical to a QA (Question Answering) system because it determines the final system performance. In this paper, we explore the behavior of shallow ranking features under different conditions. The features are easy to implement and are also suitable when complex NLP techniques or resources are not available for monolingual or cross-lingual tasks. We analyze six shallow ranking features: SCO-QAT, keyword overlap, density, IR score, mutual information score, and answer frequency. SCO-QAT (Sum of Co-occurrence of Question and Answer Terms) is a new feature that we propose and that performed well in NTCIR CLQA. It is a co-occurrence based feature that does not need extra knowledge, word-ignoring heuristic rules, or special tools. Instead, for the whole corpus, SCO-QAT calculates co-occurrence scores based solely on the passage retrieval results. Our experiments show that there is no perfect shallow ranking feature for every condition. SCO-QAT performs the best in C-C (Chinese-Chinese) QA, but it is not a good choice in E-C (English-Chinese) QA. Overall, Frequency is the best choice for E-C QA, but its performance is impaired when translation noise is present. We also found that passage depth has little impact on shallow ranking features, and that a proper answer filter with fine-grained answer types is important for E-C QA. We measured the performance of answer ranking in terms of a newly proposed metric, EAA (Expected Answer Accuracy), to cope with cases in which answers have the same score after ranking.
Department of Computer Science, National Tsing-Hua University, Taiwan, R.O.C, 101, Section 2, Kuang-Fu Road, Hsinchu, Taiwan, R.O.C.
Institute of Information Science, Academia Sinica, Taiwan, R.O.C, 128 Academia Road, Section 2, Nankang, Taipei 115, Taiwan, R.O.C.
The author for correspondence is Wen-Lian Hsu. E-mail: {aska, rog, hsu}@iis.sinica.edu.tw
Cheng-Wei Lee et al.
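The abstract says that EAA is meant to handle answers that share the same score after ranking, but it does not give the formula. As a rough illustration only, the sketch below shows one plausible tie-aware reading: if the top-scoring candidates are tied and one of them is picked uniformly at random, the expected accuracy for a question is the fraction of tied top candidates that are correct, and the metric is the mean over questions. The function names and the (score, is_correct) data layout are assumptions made for this example, not the paper's definitions.

```python
def expected_top1_accuracy(candidates):
    """Expected accuracy for one question under random tie-breaking.

    candidates: list of (score, is_correct) pairs (hypothetical layout).
    Returns the share of correct answers among those tied for the top score.
    """
    if not candidates:
        return 0.0
    top_score = max(score for score, _ in candidates)
    tied = [is_correct for score, is_correct in candidates if score == top_score]
    return sum(tied) / len(tied)


def expected_answer_accuracy(questions):
    """Mean of the per-question expected accuracy (assumed aggregation)."""
    if not questions:
        return 0.0
    return sum(expected_top1_accuracy(c) for c in questions) / len(questions)


if __name__ == "__main__":
    questions = [
        [(3, True), (3, False), (1, False)],  # two-way tie at the top, one correct
        [(2, False), (1, True)],              # wrong answer ranked first
        [(5, True)],                          # single correct answer
    ]
    print(expected_answer_accuracy(questions))
```

Under this reading, a ranker that assigns every candidate the same score is credited only with chance-level accuracy instead of being rewarded or penalized by an arbitrary tie order.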
Related resources
Presenting a Religious Question Answering Corpus in Persian
Question answering is a field of natural language processing and information retrieval that has attracted researchers' attention in recent decades. Due to growing interest in this field, the need for appropriate data sources has become apparent. Most research on developing question answering corpora has so far been done in English, but in other languages such as Persian, the lack of these co...
Bootstrap Pattern Learning for Open-Domain CLQA
We describe Javelin, a Cross-lingual Question Answering system which participated in the NTCIR-8 ACLIA evaluation and which is designed to work on any type of question, including factoid and complex questions. The key technical contribution of this paper is a minimally supervised bootstrapping approach to generating lexicosyntactic patterns used for answer extraction. The preliminary evaluation...
The LIA at QA@CLEF-2006
This article presents the first participation of the Laboratoire Informatique d’Avignon (LIA) in the Cross Language Evaluation Forum (CLEF). LIA participated in the monolingual Question Answering (QA) track dedicated to the French language, and in the cross-lingual English-to-French QA track. Two runs were submitted for each track. English questions were first translated and then answered by using t...
Spinning Straw into Gold: Using Free Text to Train Monolingual Alignment Models for Non-factoid Question Answering
Monolingual alignment models have been shown to boost the performance of question answering systems by "bridging the lexical chasm" between questions and answers. The main limitation of these approaches is that they require semi-structured training data in the form of question-answer pairs, which is difficult to obtain in specialized domains or low-resource languages. We propose two inexpensive m...
Predicting Answer Location Using Shallow Semantic Analogical Reasoning in a Factoid Question Answering System
In this paper we report our work on a factoid question answering task that avoids using a named-entity recognition tool in the answer selection process. We use semantic analogical reasoning to find the location of the final answer in a textual passage. We demonstrate that, without employing any linguistic tools during the answer selection process, our approach achieves better accuracy than a typical f...
Journal title:
- IJCLCLP
Volume 13
Publication date: 2008